ddim inversion
- North America > United States > Massachusetts (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Singapore (0.04)
- Information Technology (0.68)
- Leisure & Entertainment > Sports (0.46)
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing
Text-guided diffusion models have significantly advanced image editing, enabling high-quality and diverse modifications driven by text prompts. However, effective editing requires inverting the source image into a latent space, a process often hindered by prediction errors inherent in DDIM inversion. These errors accumulate during the diffusion process, resulting in inferior content preservation and edit fidelity, especially with conditional inputs. We address these challenges by investigating the primary contributors to error accumulation in DDIM inversion and identify the singularity problem in traditional noise schedules as a key issue. To resolve this, we introduce the, a novel noise schedule designed to eliminate singularities, improve inversion stability, and provide a better noise space for image editing. This schedule reduces noise prediction errors, enabling more faithful editing that preserves the original content of the source image. Our approach requires no additional retraining and is compatible with various existing editing methods. Experiments across eight editing tasks demonstrate the Logistic Schedule's superior performance in content preservation and edit fidelity compared to traditional noise schedules, highlighting its adaptability and effectiveness. The project page is available at https://lonelvino.github.io/SYE/.
Exploring Fixed Point in Image Editing: Theoretical Support and Convergence Optimization
In image editing, Denoising Diffusion Implicit Models (DDIM) inversion has become a widely adopted method and is extensively used in various image editing approaches. The core concept of DDIM inversion stems from the deterministic sampling technique of DDIM, which allows the DDIM process to be viewed as an Ordinary Differential Equation (ODE) process that is reversible. This enables the prediction of corresponding noise from a reference image, ensuring that the restored image from this noise remains consistent with the reference image. Image editing exploits this property by modifying the cross-attention between text and images to edit specific objects while preserving the remaining regions. However, in the DDIM inversion, using the $t-1$ time step to approximate the noise prediction at time step $t$ introduces errors between the restored image and the reference image.
OSMGen: Highly Controllable Satellite Image Synthesis using OpenStreetMap Data
Ziashahabi, Amir, Ghasemi, Narges, Shahabi, Sajjad, Krumm, John, Avestimehr, Salman, Shahabi, Cyrus
Accurate and up-to-date geospatial data are essential for urban planning, infrastructure monitoring, and environmental management. Yet, automating urban monitoring remains difficult because curated datasets of specific urban features and their changes are scarce. We introduce OSMGen, a generative framework that creates realistic satellite imagery directly from raw OpenStreetMap (OSM) data. Unlike prior work that relies on raster tiles, OSMGen uses the full richness of OSM JSON, including vector geometries, semantic tags, location, and time, giving fine-grained control over how scenes are generated. A central feature of the framework is the ability to produce consistent before-after image pairs: user edits to OSM inputs translate into targeted visual changes, while the rest of the scene is preserved. This makes it possible to generate training data that addresses scarcity and class imbalance, and to give planners a simple way to preview proposed interventions by editing map data. More broadly, OSMGen produces paired (JSON, image) data for both static and changed states, paving the way toward a closed-loop system where satellite imagery can automatically drive structured OSM updates. Source code is available at https://github.com/amir-zsh/OSMGen.
- North America > United States > California (0.14)
- Europe > United Kingdom > Scotland (0.04)
- Research Report (0.64)
- Overview (0.46)
DreamMakeup: Face Makeup Customization using Latent Diffusion Models
Park, Geon Yeong, Han, Inhwa, Yang, Serin, Hong, Yeobin, Jeong, Seongmin, Jeon, Heechan, Goh, Myeongjin, Yi, Sung Won, Nam, Jin, Ye, Jong Chul
The exponential growth of the global makeup market has paralleled advancements in virtual makeup simulation technology. Despite the progress led by GANs, their application still encounters significant challenges, including training instability and limited customization capabilities. Addressing these challenges, we introduce DreamMakup - a novel training-free Diffusion model based Makeup Customization method, leveraging the inherent advantages of diffusion models for superior controllability and precise real-image editing. DreamMakeup employs early-stopped DDIM inversion to preserve the facial structure and identity while enabling extensive customization through various conditioning inputs such as reference images, specific RGB colors, and textual descriptions. Our model demonstrates notable improvements over existing GAN-based and recent diffusion-based frameworks - improved customization, color-matching capabilities, identity preservation and compatibility with textual descriptions or LLMs with affordable computational costs.
Score Distillation via Reparametrized DDIM
While 2D diffusion models generate realistic, high-detail images, 3D shape generation methods like Score Distillation Sampling (SDS) built on these 2D diffusion models produce cartoon-like, over-smoothed shapes. To help explain this discrepancy, we show that the image guidance used in Score Distillation can be understood as the velocity field of a 2D denoising generative process, up to the choice of a noise term. In particular, after a change of variables, SDS resembles a high-variance version of Denoising Diffusion Implicit Models (DDIM) with a differently-sampled noise term: SDS introduces noise i.i.d.
- North America > United States > Massachusetts (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Singapore (0.04)
- Information Technology (0.68)
- Leisure & Entertainment > Sports (0.46)
- Automobiles & Trucks > Manufacturer (0.34)
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Vontobel, Tobias, Sadat, Seyedmorteza, Salehi, Farnood, Weber, Romann M.
Diffusion models have emerged as the leading approach for image synthesis, demonstrating exceptional photorealism and diversity. However, training diffusion models at high resolutions remains computationally prohibitive, and existing zero-shot generation techniques for synthesizing images beyond training resolutions often produce artifacts, including object duplication and spatial incoherence. In this paper, we introduce HiWave, a training-free, zero-shot approach that substantially enhances visual fidelity and structural coherence in ultra-high-resolution image synthesis using pretrained diffusion models. Our method employs a two-stage pipeline: generating a base image from the pretrained model followed by a patch-wise DDIM inversion step and a novel wavelet-based detail enhancer module. Specifically, we first utilize inversion methods to derive initial noise vectors that preserve global coherence from the base image. Subsequently, during sampling, our wavelet-domain detail enhancer retains low-frequency components from the base image to ensure structural consistency, while selectively guiding high-frequency components to enrich fine details and textures. Extensive evaluations using Stable Diffusion XL demonstrate that HiWave effectively mitigates common visual artifacts seen in prior methods, achieving superior perceptual quality. A user study confirmed HiWave's performance, where it was preferred over the state-of-the-art alternative in more than 80% of comparisons, highlighting its effectiveness for high-quality, ultra-high-resolution image synthesis without requiring retraining or architectural modifications.
- Europe > Switzerland > Zürich > Zürich (0.15)
- Europe > Austria (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (2 more...)
- Media (0.46)
- Leisure & Entertainment (0.46)
Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing
Text-guided diffusion models have significantly advanced image editing, enabling high-quality and diverse modifications driven by text prompts. However, effective editing requires inverting the source image into a latent space, a process often hindered by prediction errors inherent in DDIM inversion. These errors accumulate during the diffusion process, resulting in inferior content preservation and edit fidelity, especially with conditional inputs. We address these challenges by investigating the primary contributors to error accumulation in DDIM inversion and identify the singularity problem in traditional noise schedules as a key issue. To resolve this, we introduce the Logistic Schedule, a novel noise schedule designed to eliminate singularities, improve inversion stability, and provide a better noise space for image editing.